1. INTRODUCTION

Over the last 50 years, advances in semiconductor technology have driven continuous scaling, from
today’s 16 nm feature size toward the 1 nm feature size expected in 2028 [1]. This makes it possible to
integrate a larger number of IP cores in a single system on chip. As the number of cores grows, the
communication demand between the processing cores increases. This requires a highly scalable network with
high communication bandwidth, low latency and low power consumption. The conventional bus based
architecture cannot meet these requirements, which leads to a communication performance bottleneck. A
solution to this bottleneck is the network on chip (NoC), which improves the performance of many core
systems [2]. Compared to the previous works presented in [3], [4], NoC is the popular interconnection
infrastructure for many core communication because of its high throughput, low latency, scalability and
reusability. A NoC is composed of three components: routers, links and network interfaces (NI). Routers are
the switching elements that are responsible for forwarding data packets from one router to another.

Links are the connections between different nodes and are usually bidirectional. Network
interfaces act as the wrapper between the router and the processing elements (PE). Routers take the
routing decision based on the routing algorithm. In NoC based multiple core systems, the negative aspects of
technology scaling may increase the probability of chip defects, which may be introduced in either the
operational or the manufacturing phase. These faulty NoC systems may have defects in processing elements
(PE), routers or interconnects. Faulty interconnects and routers reduce the number of routing paths,
which results in unbalanced traffic distribution and more traffic congestion [5]. The lack of non-local
fault awareness leads to performance degradation in the NoC. Performance parameters are becoming
important aspects of multiple core chip design. The proposed reconfigurable, high performance,
secured NoC design using a hierarchical agent based monitoring system provides a promising solution to
the above issue [6] and is suitable for large multi core systems with hundreds of processing
elements. In this design, agents are distributed hierarchically to accumulate, distribute and manage the fault
information, along with security, using a random arbiter router with the XY routing algorithm.

Previous works related to hierarchical agents are found in [7]-[11]. In [6] and [11] the
overall structure of an agent based management system is discussed without any detailed design.
Hierarchical agents are used in [7] and [8] to monitor the power consumption in NoC using the DVFS
(dynamic voltage and frequency scaling) technique. In [10] an agent based management method is used to
enhance the performance of a NoC based multi core system on chip against faults or failures in the
neighbor nodes, in addition to those in a node's own components and interconnection links. These agents
inform the routers about different faults in the network, which helps the routing process to be more scalable
using the XY routing algorithm and also improves the performance. However, many issues still need to be
addressed. Previous works are limited to a 4x4 agent based NoC with non-reconfigurable and non-secured
agents. The arbiters used in the previous routers do not service the packets equally in all directions of the
node; they serve packets according to priority, which may increase packet stacking in one direction.
The agent provides only the congestion and health status of the network.

The proposed design addresses all these limitations by introducing a reconfigurable NxN
hierarchical agent based NoC with a random arbiter router using the XY routing algorithm. The router
overcomes packet stacking by servicing the packets randomly, which avoids loss of packets and improves the
memory area. The agent functionality is further enhanced so that it works as an information provider and
also decides whether packets pass or stop at the processing element by setting the firewall, which in turn
provides security. Section 1.1 discusses the existing literature on techniques for identifying and tolerating
network related faults, followed by a discussion of the research problems in Section 1.2 and the proposed
solution in Section 1.3. Section 2 discusses the algorithm implementation, followed by the result analysis
in Section 3. Finally, concluding remarks are provided in Section 4.


1.1. Background 

This section discusses the existing approaches for identifying network related faults.
Santos et al. [12] have presented a mechanism to identify maximized impedance faults using the
discrete wavelet transform. Identification of real-time faults has also been studied by Pignati et al. [13]
over a similar distributed network using a state-based estimation technique. A similar approach was
implemented by Nikander and Jarventausta [14] for network fault identification. Considering a case study
of spacecraft, Raiteri and Portinale [15] have used a Bayesian network for identifying and mitigating faults
occurring in spacecraft. Explicit analysis of the behaviour of a packet is carried out by Wang et al. [16]
using a unique form of classification technique. Probability theory has been adopted to develop a framework
for identifying faults in sensory applications, as seen in the work of Ntalampiras [17].

The authors of [17] have used a Hidden Markov Model for this purpose. Zhang and Zhang [18] have used
a graph-based approach for developing a fault identification framework, taking a satellite network as the
case study. The occurrence of network faults is also investigated over an optical network by Amaral et al. [19],
where the authors have used a specific device to accomplish the task. Optical networks have also been
studied by Zhu et al. [20], where mathematical modeling has been utilized for developing two
dimensional coding-monitoring systems. A Bayesian network is again adopted in the work carried
out by Cai et al. [21]. There have also been studies towards developing fault tolerant systems in the existing
literature. Considering the case study of chip switching, Kohler et al. [22] have developed a fault tolerant
model for improving network-on-chip performance. Vall et al. [23] have developed an estimation technique
for faults occurring in sensory networks.

Yao et al. [24] have used a linear state feedback mechanism for developing a controller system for
significant faults occurring in a network architecture. Eghbal et al. [25] have carried out an analysis of
network-on-chip architecture for overcoming various hardware related issues in chip design. Ren et al. [26]
have presented an adaptive communication strategy that overcomes faults and mitigates deadlock conditions.
Shuwaili et al. [27] have discussed a fault tolerance mechanism for network function virtualization using a
coding-based approach. Similarly, Pereira et al. [28] and Wu et al. [29] have presented fault tolerance
mechanisms for chips and sensor nodes, respectively. It can therefore be seen that various
researchers have already carried out studies towards improving the performance of fault tolerance
in network systems, especially in chip-based architectures. Each approach has its own
uniqueness as well as its own limitations. The next section outlines the problems associated with the
existing research.
1.2. Research Problem
The significant research problems are as follows:
a. Existing research on fault tolerance does not emphasize scalability when developing
fault tolerant protocols for network design.
b. None of the existing studies on NoC has highlighted design issues with the processing elements
that give rise to latent faults in a network architecture.
c. Although existing studies have worked on fault identification, few studies
classify the faults existing in the network.
d. The mechanism for formulating decisions that ensure better performance of a fault tolerant network is
not clearly defined in any existing study.
Therefore, the problem statement of the proposed study can be stated as “Developing a cost effective
model that encapsulates comprehensive network faults with equivalent focus on a packet-level control
mechanism in chip architecture is computationally challenging.”


1.3. Proposed Solution 

The prime aim of the proposed system is to develop a simple and novel approach that can optimize
the performance of the network by performing integrated operations over the network. With the aid of
analytical modeling, the proposed system performs a series of operations, e.g. i) identification of faults, ii)
identification of traffic bottleneck conditions, iii) incorporation of packet-level security, and iv) effective
monitoring of the ongoing communication. The proposed system acts as a complementary model that assists
the router in formulating a precise decision. The schema of the proposed system is shown in Figure 1.


[Figure 1 block labels: Bypass/neglect specific target from packet; RAM (16x16 Bytes); Count Up/Down;
Config. Register; Monitoring; Control Packet]
Figure 1. Schema of Proposed Method


The proposed scheme assists in aggregating, managing, and distributing the information related to
network faults using the Local Fault Register (LFR), while it uses the Regional Fault Register (RFR) to
keep the information about its neighboring nodes up to date. The cell agents exchange congestion
information bidirectionally using the same link inside the dedicated network [30]. The
congestion or fault information is determined by the agents with the help of an encoding and
decoding process. The cell agent provides security to the processing element using the config register and
the control packet stage. The config register is used for source port configuration (using a lookup table) in
order to block unwanted and unrelated packets (similar to blocking a website or virus
packets). The control packet stage gets the authorized packet information from the config register and decides
whether or not a packet is passed to the processing element. In general, a software firewall can be
hacked, but in the proposed design some of the port addresses are blocked in the hardware itself (i.e. inside
the chip), which prevents an intruder from hacking the firewall. The cell agent uses the bypass register to
ignore or bypass packets that contain video or audio related data. The agents also
monitor the maximum number of sessions per node using the session monitoring stage, which
takes care of starting and closing sessions (limited to sessions 0-31) after the task is performed. The next
section outlines the algorithm used for this purpose.
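
The security behaviour of the cell agent described above can be illustrated with a minimal software sketch. It is not the hardware design itself: the class name `CellAgent`, its fields, the string packet types and the order of the checks (bypass first, then the port lookup) are assumptions made only for illustration. The sketch models the config register as a set of allowed source ports, the bypass register as a set of packet types, and the session monitor as a counter limited to 32 sessions (0-31).

```python
from dataclasses import dataclass, field

MAX_SESSIONS = 32  # sessions 0-31, as stated in the text


@dataclass
class CellAgent:
    # Assumption: the config register is modeled as a lookup set of allowed source ports.
    allowed_ports: set = field(default_factory=set)
    # Assumption: the bypass register is modeled as a set of packet types to ignore.
    bypass_types: set = field(default_factory=set)
    open_sessions: int = 0

    def start_session(self) -> bool:
        """Count up: open a session unless the per-node limit is reached."""
        if self.open_sessions >= MAX_SESSIONS:
            return False
        self.open_sessions += 1
        return True

    def close_session(self) -> None:
        """Count down: close a session after its task is done."""
        if self.open_sessions > 0:
            self.open_sessions -= 1

    def control_packet(self, src_port: int, packet_type: str) -> str:
        """Return 'bypass', 'pass' or 'stop' for one incoming packet."""
        if packet_type in self.bypass_types:
            return "bypass"  # e.g. video or audio data is ignored by the agent
        if src_port in self.allowed_ports:
            return "pass"    # authorized source port: forward to the PE
        return "stop"        # unauthorized source port: block the packet


agent = CellAgent(allowed_ports={80, 443}, bypass_types={"video", "audio"})
print(agent.control_packet(80, "data"))   # authorized port
print(agent.control_packet(23, "data"))   # blocked port
```

Because the allowed ports are fixed when the agent is configured, a packet from any other source port never reaches the processing element in this model, which mirrors the hardware-level blocking described in the text.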
2. ALGORITHM IMPLEMENTATION

Information about faults in the network is useful for the router to make effective decisions
during routing. The fault detection circuitry in the agent provides the fault
information of the network. A NoC router is assumed to contain a priority encoder, a random arbiter and a
crossbar switch. In contrast to the techniques in existing systems [10], [31], [32], the proposed
router design avoids buffers to reduce the hardware overhead and to improve the performance. The
priority encoder selects the inputs according to the select signal that originates from the random arbiter.
The proposed router serves the packets randomly without any loss or stacking of packets. With
reference to [7], [33], fault detection circuitry is adopted in the NoC router and the links to detect
permanent faults in the network with an acceptable hardware overhead.
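
As a rough behavioural illustration of random arbitration, the sketch below picks one requesting input port uniformly at random, so that no direction is permanently favoured. The port names (four directions plus a local port) and the function name `random_arbiter` are assumptions for this example; the actual arbiter is a hardware block whose random source is not specified in this section.

```python
import random

# Assumed port naming for a five-port router: four directions plus the local port.
PORTS = ("N", "E", "S", "W", "L")


def random_arbiter(requests, rng=random):
    """Select one requesting port uniformly at random.

    `requests` maps a port name to True when that port has a packet waiting.
    Returns the selected port name, or None when no port is requesting.
    """
    active = [port for port in PORTS if requests.get(port, False)]
    if not active:
        return None
    return rng.choice(active)


# Two ports compete; over many cycles each one is served about equally often.
rng = random.Random(1)
grants = [random_arbiter({"N": True, "E": True}, rng) for _ in range(1000)]
print(grants.count("N"), grants.count("E"))
```

In contrast to a fixed-priority scheme, a port that loses one arbitration round has the same chance as every other requester in the next round.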

The fault detection circuitry provides signals that give fault awareness of the
random arbiter, priority encoder, crossbar switch and all links in the four directions of
each router. In addition, it provides fault information for other components such as the
Processing Element (PE) or core and the Network Interface (NI). In the proposed design, all the links of the
network are bidirectional, and if a permanent fault occurs in either direction, the entire link is
treated as faulty. The south direction of a router is considered faulty, or unavailable for the routing
process, if the south link, the south input pin of the current node, or the north input pin of the south
neighbor router is faulty. This condition is stated in equation (1) using the signals generated by the fault
detection stage.


$$S = \mathrm{Link}_S \;\text{or}\; \mathrm{In\_Port}_S^{\text{current-router}} \;\text{or}\; \mathrm{In\_Port}_N^{\text{S-router}} \qquad (1)$$


In equation (1) all the terms are one-bit status signals; if any term is equal to ‘1’ then the respective
component is faulty, otherwise it is healthy. In any router, a faulty input pin can be modeled by assuming
that its link is faulty. Equation (2) generalizes this condition to all four directions of the router.


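$$n = \mathrm{Link}_n \;\text{or}\; \mathrm{In\_Port}_n^{\text{current-router}} \;\text{or}\; \mathrm{In\_Port}_{(1-n)}^{n\text{-router}} \qquad (2)$$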


In (2), ‘n’ can be E, W, N or S, i.e. the East, West, North or South direction, respectively.
$\mathrm{Link}_n$ shows the status of the bidirectional link of the current router in the ‘n’ direction.
$\mathrm{In\_Port}_n^{\text{current-router}}$ gives the status of the input pins of the priority encoder in
the ‘n’ direction of the current router, and $\mathrm{In\_Port}_{(1-n)}^{n\text{-router}}$ gives the status
of the input port of the neighbor router that is placed in the n direction of the current router, where that
port faces the opposite direction of n. (1-n) indicates the opposite direction of n, which means N, S, E and
W for the S, N, W and E directions, respectively. If any one of the components inside the router is faulty,
then the entire router is considered faulty. Once the router is faulty, it is not available to do its task
(i.e. routing the packets from an input to the corresponding output port). One bit of the LFR, labeled
Node, is used to indicate the faultiness of the node. Equation (3) determines the status of the Node/Router.


$$\mathrm{Node} = \mathrm{Priority\_encoder} \;\text{or}\; \mathrm{Random\_arbiter} \;\text{or}\; \mathrm{Crossbar\_Switch} \qquad (3)$$


In equation (3), if any one of the terms is faulty, then the entire node is considered a faulty node. In
multiple core networks on chip, the processing elements are connected to the network via the network
interface. If a PE is not working, the platform level automatically remaps the packet to another core on the
network according to the health status information. Equation (4) says that the PE becomes unavailable if
the PE itself is faulty (its status is ‘1’), if its network interface is faulty, or if the local link
connecting the router and the PE is faulty.


$$\mathrm{PE} = \mathrm{PE}_{\text{local}} \;\text{or}\; \mathrm{NI} \;\text{or}\; \mathrm{Link}_{\text{local}} \qquad (4)$$


The fault information determined using equations (1) to (4) is useful for the routing process and
improves performance by avoiding deadlock and livelock situations. This fault information is classified
and transferred to the top level of the system to map the packet to a healthy node, which in turn improves
the fault tolerance capability and the cost of the routing algorithm [34]. Such local fault information is
stored in the LFR. The local fault register is 8 bits in size, of which 6 bits indicate the faulty status and 2
bits are reserved for future enhancement. Of these six bits, four update the status of the four input
pins of the router, which helps neighbor nodes to update their local fault registers according to equation (2).
The remaining 2 bits update the status of Node and PE. Assume that the center node is the current
node and that it has four neighboring nodes.
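
Equations (1) to (4) and the LFR layout can be summarized in a short behavioural sketch. The function names and, in particular, the bit positions inside the 8-bit LFR are assumptions for illustration; the text only states that four bits carry the direction status, two bits carry Node and PE, and two bits are reserved.

```python
def direction_status(link_n, in_port_n_current, in_port_opp_neighbor):
    """Equation (2): direction n is faulty (1) if any of its three terms is 1."""
    return link_n | in_port_n_current | in_port_opp_neighbor


def node_status(priority_encoder, random_arbiter, crossbar_switch):
    """Equation (3): the node is faulty if any internal router component is faulty."""
    return priority_encoder | random_arbiter | crossbar_switch


def pe_status(pe_local, ni, link_local):
    """Equation (4): the PE is unavailable if the PE, its NI or its local link is faulty."""
    return pe_local | ni | link_local


# Assumed bit layout: bits 0-3 = N, E, S, W; bit 4 = Node; bit 5 = PE; bits 6-7 reserved.
DIRECTION_BITS = ("N", "E", "S", "W")


def pack_lfr(directions, node, pe):
    """Pack the one-bit status values into an 8-bit Local Fault Register value."""
    value = 0
    for bit, name in enumerate(DIRECTION_BITS):
        value |= (directions[name] & 1) << bit
    value |= (node & 1) << 4
    value |= (pe & 1) << 5
    return value


# Example: the south link is faulty and the network interface of the PE is faulty.
south = direction_status(link_n=1, in_port_n_current=0, in_port_opp_neighbor=0)
lfr = pack_lfr({"N": 0, "E": 0, "S": south, "W": 0},
               node=node_status(0, 0, 0),
               pe=pe_status(0, 1, 0))
print(f"{lfr:08b}")  # prints 00100100 under the assumed bit layout
```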

The current node updates its own component fault information in the LFR and also updates the
neighboring fault information with the help of the RFR. The RFR is 8 bits in size, of which four bits are used
to update the neighbor node fault information and the remaining four bits are reserved for future enhancement.
Equations (1) to (4) update the LFR of all the nodes. Each LFR helps to update the RFR of all the
neighboring nodes with new fault information. Accordingly, the center node has the fault information
of the nodes on its N, E, W and S sides. If any bit of the LFR of the north side node is equal to one, then the
RFR of the current node sets the ‘NN’ bit to one while the remaining bits stay at zero, which says
that the north side node is unhealthy and that packets should not be sent towards the north node if the
destination is the top right node.
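
The RFR update rule can be sketched as follows. The bit positions in the RFR, the mask that selects the six status bits of a neighbor's LFR (assumed here to be the lower six bits), and the function name `update_rfr` are assumptions for illustration only.

```python
# Assumed RFR bit positions for the four neighbor-status bits.
RFR_BITS = {"N": 0, "E": 1, "S": 2, "W": 3}

# Assumption: the six status bits of an LFR occupy its lower six bits.
LFR_STATUS_MASK = 0x3F


def update_rfr(neighbor_lfrs):
    """Build the RFR of the current node from the LFR values of its neighbors.

    `neighbor_lfrs` maps 'N', 'E', 'S', 'W' to the 8-bit LFR of that neighbor.
    A neighbor bit is set to 1 when any status bit of that neighbor's LFR is 1.
    """
    rfr = 0
    for direction, lfr in neighbor_lfrs.items():
        if lfr & LFR_STATUS_MASK:
            rfr |= 1 << RFR_BITS[direction]
    return rfr


# The north neighbor reports one fault; all other neighbors are healthy.
print(f"{update_rfr({'N': 0b000100, 'E': 0, 'S': 0, 'W': 0}):08b}")  # prints 00000001
```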

The proposed hierarchical agent structure is shown in Figure 2. Every cell agent and cluster agent
is updated bidirectionally with new fault information of its own cluster cells and of the neighboring
cluster cells [6]. This fault information is sent to the top level of the system, which helps to map the packet
to a healthy node by selecting the best path. The Cluster Separation Module (CSM) helps the packet to reach
its respective cluster agent by considering the cluster selection bits in the packet. The cluster then routes
the packet to the respective cell agent according to the routing information in the packet. Finally, the agent
applies its security check to decide whether the packet is passed to the processing element or stopped.


[Figure 2 block labels: Application Level; Platform Level; Cluster Separation Module; Cluster Agent;
Cell Agent; Two cluster agents]

Figure 2. Hierarchical agents in two neighbor clusters


A 4X4 agent based NoC, as shown in Figure 3, includes processing elements, network interfaces, routers
and agents (an agent can be either a cell agent or a cluster agent). All the agents are connected
bidirectionally, and one bit of information is exchanged between the agents to update the RFR. These
agents are then connected to the NoC router network. Packets from the application level enter the router via
the agents so that the security aspects can be checked, as explained in a later section. The proposed
agent based monitoring system uses two types of communication: peer to peer communication (used
between the agents) and base line data network communication (for controlling and routing the packets in the
network).
Figure 3. A 4X4 agent based NoC


The cluster agent accumulates the critical fault information (PE or whole node failure) that occurs
inside the cluster and sends that information to the higher level. The cluster then receives a command
from the higher level for reconfiguration, remapping of the packet, or task migration. The cluster separation
module separates the received packets and sends each one to the respective cluster
according to the n-th bit of the packet. If the n-th bit is zero then the received packet belongs to cluster_1,
otherwise it belongs to cluster_2. If the number of clusters increases, then the number of cluster selection
control bits must also be increased in order to segregate the packets to the desired cluster.
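
A minimal sketch of the cluster separation step is given below. The position of the selection bit (bit 31 of a 32-bit packet in the example) and the function name `select_cluster` are hypothetical; the text does not state which bit is used. The returned index is zero-based, so index 0 corresponds to cluster_1 and index 1 to cluster_2.

```python
def select_cluster(packet: int, select_lsb: int, num_select_bits: int = 1) -> int:
    """Return the zero-based cluster index encoded in the cluster selection bits.

    `select_lsb` is the position of the lowest selection bit (hypothetical value),
    and `num_select_bits` grows as the number of clusters grows.
    """
    mask = (1 << num_select_bits) - 1
    return (packet >> select_lsb) & mask


# Two clusters: one selection bit (assumed here to be bit 31).
print(select_cluster(0x0000_00A5, select_lsb=31))  # 0, i.e. cluster_1
print(select_cluster(0x8000_00A5, select_lsb=31))  # 1, i.e. cluster_2

# Four clusters: two selection bits (assumed here to be bits 31 and 30).
print(select_cluster(0xC000_0000, select_lsb=30, num_select_bits=2))  # 3
```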

The XY based fault tolerant routing algorithm [33],[34] is incorporated in the proposed hierarchical
agent based management method. The node is developed based on the equations discussed above.
This routing algorithm is low cost, adaptive and congestion aware, which makes it suitable for NoC based
multiple core systems on chip. For example, consider a 3x3 network in which the center node includes the
cluster agent. In this network the top left node is the source node and the bottom right node is the
destination node. The source node should be aware of the status of the links labeled E, S, ES and SE that
surround the destination node [35],[36]. The fault status of these links is not updated in the neighbor nodes
of the source node, so the cluster agent provides this information to the routing algorithm. With the help of
this information, the routing algorithm collects all the fault and congestion information and reaches the
desired location along the shortest path. Algorithm (1) shows the management algorithm used by the
hierarchical agents.
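
For orientation, the sketch below shows plain XY next-hop selection with a simple fault-aware fallback. It is not the fault tolerant algorithm of [33],[34]: the coordinate convention (x grows eastward, y grows southward), the `faulty` set of blocked output directions, and the fallback to the other productive direction are all assumptions made for this example, and the fallback by itself does not guarantee deadlock freedom.

```python
def xy_next_hop(current, destination, faulty=frozenset()):
    """Return the next output direction for a packet, or None if it is blocked.

    `current` and `destination` are (x, y) coordinates. `faulty` holds the output
    directions of the current node that are reported as faulty.
    """
    cx, cy = current
    dx, dy = destination
    if (cx, cy) == (dx, dy):
        return "LOCAL"  # deliver to the local processing element

    x_dir = "E" if dx > cx else "W" if dx < cx else None
    y_dir = "S" if dy > cy else "N" if dy < cy else None

    # XY order: correct the X offset first, then the Y offset.
    # Assumed fallback: if the preferred direction is faulty, try the other one.
    for direction in (x_dir, y_dir):
        if direction is not None and direction not in faulty:
            return direction
    return None


# 3x3 example from the text: source at top left (0, 0), destination at bottom right (2, 2).
print(xy_next_hop((0, 0), (2, 2)))                # E
print(xy_next_hop((0, 0), (2, 2), faulty={"E"}))  # S
```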


Agent Management Algorithm
Input: Fault, congestion and security information from cell agents (NA) and neighbor cluster
agents (CA)
for each agent do
    wait until new congestion or fault information is received;
    if (a node fails or NI fails or PE fails or no control packet is received from an NA within
    the time limit) then
        inform the top level and its associated node agents;
        receive the packet remapping or task reallocation information;
        segregate the failed PE or node;
    else if (new fault information is received from a CA) then
        inform the neighboring cluster agents and the associated node agents of the new fault
        and congestion information;
    else if (new fault information is received from an NA) then
        inform the neighboring node agents within the cluster agent of the new congestion
        and fault information;
    end if
    if (destination XY address == current XY address) then
        check the security to decide whether the packet is passed to the destination PE or not;
        if (security check == 0) then
            the 32 bit packet data reaches the destination node;
        else
            the packet is discarded;
        end if
    end if
end for


3. RESULT ANALYSIS 

To analyze the effect of the proposed hierarchical secured agent based monitoring system on
network performance, the 4x4 agent based NoC is designed in HDL code and simulated using the Xilinx ISE
14.2 tool with ModelSim 6.3f. It is synthesized and implemented on a Virtex-5 FPGA (XC5VFX70T) kit.
The performance of the proposed secured agent based monitoring system is analyzed and compared
with the existing methods. In the proposed design each cluster is a 4X4 sub network in which the center node
is treated as the cluster agent. It is assumed that 6.25% of the nodes are faulty (one faulty node out of 16
nodes) and 20.83% of the links are faulty (five faulty links out of 24 links), which leads to a system fault of
27%, as depicted in Figure 4(a).
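
The fault percentages quoted above can be reproduced with a few lines of arithmetic. The helper `mesh_links` is introduced here only for illustration; it counts the bidirectional links of a 2D mesh, which gives the 24 links of the 4x4 network.

```python
def mesh_links(rows, cols):
    """Number of bidirectional links in a rows x cols 2D mesh."""
    return rows * (cols - 1) + cols * (rows - 1)


nodes = 4 * 4                      # 16 nodes
links = mesh_links(4, 4)           # 24 links

node_fault_pct = 100 * 1 / nodes   # one faulty node
link_fault_pct = 100 * 5 / links   # five faulty links
system_fault_pct = node_fault_pct + link_fault_pct

print(f"{node_fault_pct:.2f}% + {link_fault_pct:.2f}% = {system_fault_pct:.2f}%")
# prints 6.25% + 20.83% = 27.08%
```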


Figure 4(a). A faulty 4x4 NoC 
Figure 4(b). Average packet latency analysis
The same assumption is made for the networks with and without agents, and the performance
analysis is shown in Figure 4(b). From the graph it is observed that the XY method without agents has
no means to reliably send all the packets to their destination in faulty situations. A packet may get
stuck in a faulty node and then has to be resent from the top level, which leads to
performance degradation [33]. In the proposed design, with prior knowledge of all the faulty links and
nodes, the packet reaches a healthy node within a reliable time.

The proposed secured hierarchical agent-based system leads to higher performance and saturation
points as compared to the method introduced in [10]. As depicted in Figure 4(a), source node S sends a packet
to destination node D. There are two paths, P1 and P2, from the source node to the destination node, of
which P1 is the minimal path. The proposed hierarchical secured
agent therefore selects the minimal path P1 to route the packet using the XY routing algorithm. The waveform
in Figure 5 shows the node to node packet transfer path from node 2 to node 16 (i.e. N2-N6-N10-N15-N16), as
highlighted on the waveform.
Figure 5. 4X4 secured agent based NoC Design implementation waveform


In Figure 6(a) and Figure 6(b), the throughput of the proposed design is compared with that of the existing
functional diagnosis method under normal and heavy load conditions with uniform traffic. According to the
graphs, the proposed method has higher throughput than the method introduced in [30]. To analyze the area
overhead of the proposed design, DyXY [35], a basic adaptive routing method, RAFT [33], an
adaptive fault tolerant technique, and the existing agent based fault tolerant routing algorithm [10] are
implemented and compared with the proposed secured agent based fault tolerant routing algorithm using HDL
code at a 201.74 MHz clock speed.
Figure 6(a). Throughput with normal load
Figure 6(b). Throughput with heavy load


Table 1 shows the area utilization of the proposed five port router compared to other existing methods. In
addition, the table gives the area overhead of the proposed design as compared to the other methods. Based on
the hardware analysis table, the area utilization and hardware overhead of the proposed design are improved
by 1.4% as compared to the existing agent based fault tolerant method [10]. It is worth mentioning that the
DyXY method cannot reliably transfer all the packets to their destination in faulty situations.


Table 1. Device Utilization Summary

Routing Method              Area utilization (gate count)   Area overhead
                            for 5-port router               comparison (%)
DyXY [35]                   36350                           12.7
RAFT [33]                   39355                           4.1
Agent based Routing [10]    41574                           NA
Proposed System             40922                           1.4


4. CONCLUSION 

In this paper, a hierarchical secured agent based monitoring system is proposed for fault tolerant
multi core NoC based systems on chip. The hierarchically distributed agents collect, manage and distribute
the fault and congestion information of the network to the higher level of the system. This fault information
helps the application level to route packets to healthy nodes, which improves the performance of the network
by avoiding the packet latency caused by faulty nodes and links. In addition, the agent provides security to
the PE by blocking unwanted and unrelated packets from entering the PE, which avoids a livelock
situation for the high priority packets that are related to the dedicated node. According to the simulation and
synthesis results, the proposed design enhances the network performance with an improved hardware
overhead by using the modified router design.